ICT
Lesson A19 - Searches: Sequential & Binary
 
 

LAB ASSIGNMENT A19.3

CountWords

Background:

  1. In this lab assignment, you will write a program that counts the occurrences of words in a text file. Your program must handle the following special cases:

    Special Case                                       Explanation
    -------------------------------------------------  --------------------------------------------
    Hyphenated words (e.g., sixty-three)               Count as one word.
    Words separated by a hyphen with a blank space     Count as two words.
      on each side (e.g., joyous - sparkling)
    Words with apostrophes (e.g., 'tis or can't)       Count as one word.
    Upper and lower case (e.g., The and the)           Count both as occurrences of the word 'the'.
                                                       Convert capital letters to lower case
                                                       before counting.
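
    One possible way to apply the special cases above is to lower-case each line and pull out words with a regular expression. The sketch below is only an illustration, not the required solution. The class name WordTokenizer and the method tokenize are hypothetical. It assumes that words consist of the letters a-z only, and that a leading apostrophe is dropped, so 'tis is stored as tis but still counted as one word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name and the exact rules are assumptions.
class WordTokenizer {
    // A word is a run of letters, optionally continued by a single hyphen or
    // apostrophe that is immediately followed by more letters. A hyphen with
    // blanks around it never matches, so it separates two words.
    private static final Pattern WORD = Pattern.compile("[a-z]+(?:['-][a-z]+)*");

    // Returns the lower-cased words found in one line of text.
    static List<String> tokenize(String line) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(line.toLowerCase(Locale.ROOT));
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // Prints the words found in a line containing each special case.
        System.out.println(tokenize("The joyous - sparkling sixty-three can't"));
    }
}
```

    If your data file contains digits or typographic (curly) apostrophes, the pattern would need to be extended to cover them.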

  2. You are encouraged to use a combination of all the programming tools you have learned so far, such as:

    Data Structures    Algorithms
    ---------------    --------------------
    Array classes      sorting
    String class       searches
    ArrayList class    text file processing
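
    As one illustration of how these tools might fit together, the sketch below keeps an ArrayList of word entries in alphabetical order and uses a binary search to find each incoming word. If the word is found, its count goes up; otherwise a new entry is inserted at the position where the search ended. The names WordTally, Entry, find, and add are hypothetical, and this is only one of several reasonable designs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class; a sketch of counting words with a sorted list.
class WordTally {
    // One word and the number of times it has been seen.
    static class Entry {
        final String word;
        int count;

        Entry(String word) {
            this.word = word;
            this.count = 1;
        }
    }

    private final List<Entry> entries = new ArrayList<>(); // kept sorted by word
    private int totalWords = 0;

    // Binary search. Returns the index of the word if present; otherwise
    // returns -(insertionPoint) - 1, which is always negative.
    private int find(String word) {
        int low = 0;
        int high = entries.size() - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            int cmp = entries.get(mid).word.compareTo(word);
            if (cmp < 0) {
                low = mid + 1;
            } else if (cmp > 0) {
                high = mid - 1;
            } else {
                return mid;
            }
        }
        return -(low + 1);
    }

    // Records one occurrence of a word (assumed to be lower case already).
    void add(String word) {
        totalWords++;
        int i = find(word);
        if (i >= 0) {
            entries.get(i).count++;
        } else {
            entries.add(-(i + 1), new Entry(word)); // insert in sorted position
        }
    }

    int countOf(String word) {
        int i = find(word);
        return i >= 0 ? entries.get(i).count : 0;
    }

    int uniqueWords() {
        return entries.size();
    }

    int totalWords() {
        return totalWords;
    }
}
```

    Each lookup takes about log2(n) comparisons for n unique words, compared with up to n for a sequential search. Inserting a new word into the middle of an ArrayList still shifts the entries after it.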

Assignment:

  1. Your instructor will provide you with a data file (such as test.txt, Lincoln.txt, or dream.txt) to analyze. Parse the file and print out the following statistical results:

    - Total number of unique words used in the file.
    - Total number of words in the file.
    - The 30 most frequently occurring words, sorted in descending order by count.

    For example:

 1    103   the
 2     97   of
 3     59   to
 4     43   and
 5     36   a

 6     32   be
 7     32   we
 8     26   will
 9     24   that
10     21   is

... rest of top 30 words ...

Number of words used = 525
Total # of words = 1577
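
The report above could be produced along the following lines once the counts are known. This is a minimal sketch under several assumptions: the counts are held in a Map from word to count (your own solution may use an ArrayList or array instead), words with equal counts are listed alphabetically, and the class name WordReport and method topWords are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class; a sketch of sorting counts and formatting the report.
class WordReport {
    // Returns up to 'limit' report lines, highest count first.
    // Assumption: ties in count are broken alphabetically by word.
    static List<String> topWords(Map<String, Integer> counts, int limit) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        List<String> lines = new ArrayList<>();
        for (int i = 0; i < entries.size() && i < limit; i++) {
            Map.Entry<String, Integer> e = entries.get(i);
            // Rank in a 2-wide column, count in a 7-wide column, then the word.
            lines.add(String.format("%2d%7d   %s", i + 1, e.getValue(), e.getKey()));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Tiny made-up input, used only to exercise the method.
        Map<String, Integer> counts = new HashMap<>();
        for (String w : "the of the to of the".split(" ")) {
            counts.merge(w, 1, Integer::sum);
        }

        int total = 0;
        for (int c : counts.values()) {
            total += c;
        }

        for (String line : topWords(counts, 30)) {
            System.out.println(line);
        }
        System.out.println();
        System.out.println("Number of words used = " + counts.size());
        System.out.println("Total # of words = " + total);
    }
}
```

The sketch does not print the blank line after every fifth entry shown in the sample output; a check on the loop index inside main would add it.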

 

 © ICT 2006, All Rights Reserved.